Speech Recognition of Foreign Out-o Hierarchical Lang
Abstract
This paper proposes a new speech recognition scheme for foreign out-of-vocabulary words embedded in native-language speech. To recognize foreign names frequently observed in news speech or in translation speech, we adopted a hierarchical language model that had been successfully applied to OOV words covering native vocabularies. In this hierarchical language model, OOV vocabularies are modeled as a word-class model in the upper-layered model, and their statistical phonotactic constraints are modeled in the lower-layered model. Since extra statistics are needed to cover foreign words and their pronunciation differences, we have introduced two techniques. The first is to combine translation target language models and translation source statistics of OOVs using the hierarchical language model. The second is to automatically generate recognition target pronunciations from original pronunciations by syllable-to-syllable mapping. To confirm the validity of this recognition scheme, we have conducted speech recognition experiments using English speech including Japanese personal names as OOV words. The proposed method outperformed the existing algorithm using a lexicon consisting of all the words in the training set. Surprisingly, it achieved better OOV recognition results than the non-OOV condition where all the proper names in the test set are registered in the lexicon.
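The two-layer idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the probability tables, the `<NAME>` class token, the syllable inventory, the phone symbols, and the function names `oov_log_prob` and `map_pronunciation` are all hypothetical values chosen for illustration. The upper layer scores an OOV class token in its word context, the lower layer scores the syllable sequence inside that class, and a lookup table stands in for the syllable-to-syllable pronunciation mapping.

```python
import math

# Upper layer (assumed toy values): word-level bigram over the recognition
# language, where the class token "<NAME>" stands in for any OOV foreign name.
UPPER_BIGRAM = {
    ("mr", "<NAME>"): 0.2,
    ("with", "<NAME>"): 0.1,
}

# Lower layer (assumed toy values): syllable bigram capturing phonotactic
# constraints of names in the source language.
LOWER_BIGRAM = {
    ("<s>", "ta"): 0.3,
    ("ta", "na"): 0.4,
    ("na", "ka"): 0.5,
    ("ka", "</s>"): 0.6,
}

# Hypothetical syllable-to-syllable table: source-language syllable mapped to
# a phone string as a speaker of the recognition language might produce it.
SYLLABLE_MAP = {
    "ta": "t aa",
    "na": "n aa",
    "ka": "k aa",
}


def oov_log_prob(prev_word, syllables, floor=1e-6):
    """Log-probability of an OOV name: class score plus syllable-sequence score."""
    # Upper layer: probability of the OOV class given the previous word.
    logp = math.log(UPPER_BIGRAM.get((prev_word, "<NAME>"), floor))
    # Lower layer: probability of the syllable sequence inside the class.
    seq = ["<s>", *syllables, "</s>"]
    for left, right in zip(seq, seq[1:]):
        logp += math.log(LOWER_BIGRAM.get((left, right), floor))
    return logp


def map_pronunciation(syllables):
    """Map each source syllable to its target pronunciation; unknown ones pass through."""
    return [SYLLABLE_MAP.get(s, s) for s in syllables]
```

In this sketch, a syllable sequence never seen in the lower-layer table falls back to the `floor` probability, so well-formed names score higher than arbitrary syllable strings in the same word context.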
Similar resources
Automatic Detection of Foreign Accent for Automatic Speech Recognition
Recognition of foreign accented speech remains among the most difficult tasks in automatic speech recognition. It was observed that using models trained on foreign data together with native models improves the recognition for speakers with foreign accent. However such an approach degrades the recognition performances on native speakers. In order to avoid such performance degradation the degree ...
Pragmalinguistic and Sociopragmatic Recognition of High and Low Level EFL Learners
This study investigated the effects of English as foreign language (EFL) proficiency on what the authors of this study called pragmalinguistic and sociopragmatic recognition of EFL learners. To elicit the data, the study used two types of pragmatic measures: a pragmalinguistic recognition (PLR) test and a sociopragmatic recognition (SPR) test. Both tests were developed by the researchers of thi...
[jɑːmes] or [dʒɛɪmz] or Perhaps Something In-between? Recapping Three Years of Xenophone Studies
This paper summarises work on ‘xenophones’ (foreign sounds) carried out at Telia Research. The inclusion of “foreign” sounds in Swedish is described, as well as their implications on speech recognition and speech synthesis. Results from two earlier studies are summarised and described: the nature of the expansion of what is normally regarded as the Swedish phone set, and the nature of some poss...
Thai Spelling Recognition Using a Continuous Speech Corpus
Spelling recognition is an approach to enhance a speech recognizer’s ability to cope with incorrectly recognized words and out-of-vocabulary words. This paper presents a general framework for Thai speech recognition enhanced with spelling recognition. In order to implement Thai spelling recognition, Thai alphabets and their spelling methods are analyzed. Based on hidden Markov models, we propos...
Publication date: 2006